Capturing spatial interdependence in image features: the counting grid, an epitomic representation for bags of features

Abstract

In recent scene recognition research, images or large image regions are often represented as disorganized "bags" of features, which can then be analyzed using models originally developed to capture co-variation of word counts in text. However, image feature counts are likely to be constrained in different ways than word counts in text. For example, as a camera pans upwards from a building entrance over its first few floors and then further up into the sky (Fig. 1), some feature counts in the image drop while others rise, only to drop again, giving way to features found more often at higher elevations. The space of all possible feature count combinations is constrained both by the properties of the larger scene and by the size and location of the window into it. To capture such variation, in this paper we propose the use of the counting grid model. This generative model is based on a grid of feature counts, considerably larger than any of the modeled images, and considerably smaller than the real estate needed to tile the images next to each other tightly. Each modeled image is assumed to have a representative window in the grid in which the feature counts mimic the feature distribution in the image. We provide a learning procedure that jointly maps all images in the training set to the counting grid and estimates the appropriate local counts in it. Experimentally, we demonstrate that the resulting representation captures the space of feature count combinations more accurately than the traditional models, not only when the input images come from a panning camera, but even when modeling images of different scenes from the same category.
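
The window idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it only shows how a bag of feature counts might be scored against every window of a grid of per-cell feature distributions, and it omits the learning procedure entirely. The function names (`window_histograms`, `best_window`), the array layout, the wraparound at the grid edges, and the multinomial-style score are all assumptions made for illustration.

```python
import numpy as np

def window_histograms(pi, win):
    """Average per-cell feature distributions over every window position.

    pi:  array of shape (E1, E2, Z); each cell holds a distribution over Z features.
    win: (W1, W2) window size. Windows wrap around the grid edges (an assumption).
    Returns h where h[i, j] is the mean of pi over the window with top-left cell (i, j).
    """
    w1, w2 = win
    h = np.zeros_like(pi, dtype=float)
    for di in range(w1):
        for dj in range(w2):
            # A negative shift moves cell (i + di, j + dj) to position (i, j).
            h += np.roll(pi, shift=(-di, -dj), axis=(0, 1))
    return h / (w1 * w2)

def best_window(counts, pi, win, eps=1e-12):
    """Score a bag of feature counts against each window and pick the best one.

    The score is sum_z counts[z] * log h[i, j, z], i.e. a multinomial
    log-likelihood up to a constant that does not depend on the window.
    """
    h = window_histograms(pi, win)
    loglik = np.tensordot(np.log(h + eps), counts, axes=([2], [0]))  # shape (E1, E2)
    idx = np.unravel_index(np.argmax(loglik), loglik.shape)
    return idx, loglik

# Toy grid: the top two rows favor feature 0, the bottom two rows favor feature 1.
pi = np.empty((4, 4, 2))
pi[:2, :, :] = [0.9, 0.1]
pi[2:, :, :] = [0.1, 0.9]

# A bag made only of feature 0 should map to a window covering the top rows.
idx, loglik = best_window(np.array([10.0, 0.0]), pi, win=(2, 2))
print("best window top-left cell:", idx)
```

In this toy setup, sliding the window down the grid changes the expected feature mix gradually, which mirrors the panning-camera example in the abstract: neighboring windows share cells, so their feature distributions are tied together.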